Abstract

The interaction of distinct units in physical, social, biological and technological systems 
naturally gives rise to complex network structures. Networks have constantly been in the 
focus of research for the last decade, with considerable advances in the description of 
their structural and dynamical properties. However, much less effort has been devoted to 
studying the controllability of the dynamics taking place on them. Here we introduce and 
evaluate a dynamical process defined on the edges of a network, and demonstrate that the 
controllability properties of this process significantly differ from simple nodal dynamics. 
Evaluation of real-world networks indicates that most of them are more controllable than 
their randomized counterparts. We also find that transcriptional regulatory networks are 
particularly easy to control. Analytic calculations show that networks with scale-free 
degree distributions have better controllability properties than uncorrelated networks, 
and positively correlated in- and out-degrees enhance the controllability of the proposed 
dynamics. 

The last decade has witnessed an explosive growth of interest in the descriptive analysis
of complex natural and technological systems that permeate many aspects of everyday life
[3, 11, 42]. Research in network science has mostly been focused on measuring [6, 8, 62],
modeling [28, 44] and decomposing [22, 43, 47] network representations of existing natural
phenomena in order to deepen our understanding of the underlying systems. Considerably
less attention has been dedicated to the various types of network dynamics [?, 34, 46, 52] and
even less to the problem of controllability [33, 53, 63], i.e. determining the conditions under
which the dynamics of a network can be driven from any initial state to any desired final
state within finite time [26, 32, 59, 60]. Structural controllability [31] has been proposed
recently as a framework for studying the controllability properties of directed complex
networks [32]. In this framework, a linear time-invariant nodal dynamics is assumed on the
network, governed by the following equation:

$$\dot{x}(t) = A x(t) + B u(t) \tag{1}$$

where A is the transpose of the (weighted) adjacency matrix of the network, x(t) is a
time-dependent vector of the state variables of the nodes, u(t) is the vector of input signals,
and B is the so-called input matrix which defines how the input signals are connected to the
nodes of the network. The dynamics is said to be structurally controllable if there exists a
matrix A* with the same structure as A such that the network can be driven from any initial
state to any final state by appropriately choosing the input signals u(t) [31]. Here, structural
equivalence of A and A* means that A* is not allowed to contain a non-zero entry when
the corresponding entry in A is zero. Structural controllability is a general property in the
sense that almost all weight combinations of a given network are controllable if the network
is structurally controllable for a given B [31, 57]. The minimum number of input signals is
then determined by finding a maximum matching in the network, i.e. a maximum subset of
edges such that each node has at most one inbound and at most one outbound edge from the
matching. The number of nodes without inbound edges from the matching is then equal to
the number of input signals required for structural controllability [32].
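
The matching-based count for the linear nodal dynamics can be illustrated with a minimal
Python sketch. It is not the implementation used for the results in this paper: the function
names are hypothetical, vertices are assumed to be labelled 0..n-1, the matching is found with
a plain augmenting-path search, and the convention that a perfectly matched network still
needs one input signal is an assumption taken from the nodal framework.

```python
def max_matching_size(n, edges):
    """Size of a maximum matching of a digraph with vertices 0..n-1.

    Each vertex may be the tail of at most one matched edge and the head of
    at most one matched edge, which is a bipartite matching between the
    "tail copy" and the "head copy" of the vertex set.
    """
    adj = [[] for _ in range(n)]
    for tail, head in edges:
        adj[tail].append(head)
    tail_matched_to = [-1] * n  # tail_matched_to[h] = tail of the matched edge into h

    def try_augment(tail, seen):
        # Standard augmenting-path step (Kuhn's algorithm).
        for head in adj[tail]:
            if head in seen:
                continue
            seen.add(head)
            if tail_matched_to[head] == -1 or try_augment(tail_matched_to[head], seen):
                tail_matched_to[head] = tail
                return True
        return False

    return sum(1 for tail in range(n) if try_augment(tail, set()))


def nodal_input_count(n, edges):
    """Number of vertices without an inbound matched edge (assumed to be at least 1)."""
    return max(n - max_matching_size(n, edges), 1)


if __name__ == "__main__":
    path = [(0, 1), (1, 2), (2, 3)]
    out_star = [(0, 1), (0, 2), (0, 3)]
    print(nodal_input_count(4, path), nodal_input_count(4, out_star))
```

In the out-star only one of the three edges can be matched, so the hub and two of its
leaves are left without an inbound matched edge.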

Perhaps the most striking feature of the structural controllability approach to linear nodal 
dynamics is that input signals tend to control the hubs of the network only indirectly. In 
addition, real-world networks that seem to have evolved to control an underlying process 
(such as transcriptional regulatory networks) need many input signals [32]. This is due to
the fact that driven nodes (i.e. those which receive an input signal directly) are not able to 
control their subordinates independently from each other. However, these results apply only 
for linear nodal dynamics. In this paper, we examine and describe a dynamics that takes 
place on the edges of the network, and show that this dynamics leads to significantly different 
controllability properties for the same real-world networks. 



1 Switchboard dynamics in complex networks 

We study a dynamical process on the edges of a directed complex network G(V, E) as follows.
Let $x = [x_j]$ denote the state vector of the process, where one state variable corresponds
to each edge of the network. Let $y_i^-$ and $y_i^+$ be vectors consisting of those $x_j$ values that
correspond to the inbound and outbound edges of vertex i, respectively, and let $M_i$ denote
a matrix with the number of rows being equal to the out-degree and the number of columns
being equal to the in-degree of vertex i. Furthermore, we assume that the dynamics can
be influenced from the environment by adding an offset vector $u_i$ to the state vector of the
outbound edges of any node i. The equations governing the dynamics of the network are then
as follows:

$$\dot{y}_i^+(t) = M_i y_i^-(t) - \tau_i \otimes y_i^+(t) + \sigma_i u_i(t) \tag{2}$$

where $\tau_i$ is a vector of damping terms corresponding to the edges in $y_i^+(t)$, $\sigma_i$ is 1 if vertex
i is a so-called driver node and zero otherwise, and $\otimes$ denotes the entry-wise product of two
vectors of the same size.

We call the above the switchboard dynamics (SBD) since each vertex i acts as a small
switchboard-like device mapping the signals of the inbound edges to the outbound edges
using a linear operator $M_i$, which is called the mixing or switching matrix from now on. To
simplify the equations, state variables and signals like $y_i^+$, $y_i^-$ and $u_i$ are implicitly considered
as time-dependent, even if the time variable t is omitted. Furthermore, note that for an edge
v → w, exactly one of the coordinates of $u_v$ affects the state of this edge, therefore we can
simply introduce a unified input vector u where the jth element $u_j$ is simply the component
of the offset vectors that affects edge j directly.

In some sense, the SBD provides a simplified representation of the underlying dynamic
processes of many real-world networks. For instance, in social communication networks, a
node (i.e. a person) is constantly processing the information received via its inbound edges
and makes decisions which are then communicated to other nodes via the outbound edges.
The inbound and outbound signals are then represented by the state variables $x_j$, while the
decision process is modeled by the mixing matrices $M_i$.

We must also explain the motivation of introducing the offset vectors as a means of
controlling the system instead of assuming external input signals. In most networks, one
usually cannot take control over a single edge as the connections do not always have a
physical realization. Therefore, in order to control an edge in a network, one has to take
control over the vertex from which the edge originates, and adjust the output vector of the
vertex appropriately. This adjustment is represented by the term $\sigma_i u_i$ for each vertex i.
Throughout this paper, we will be interested in determining an optimal control configuration
for the SBD of a given network, where optimality is measured by the number of driver nodes.

First, we make a connection between the switchboard dynamics and a standard linear
dynamical system by rewriting the equations of the switchboard dynamics (Eq. 2) in terms
of $x_j$. Note that the derivative of the state of an arbitrary edge j originating in some vertex
r and terminating in vertex s depends only on itself and on the states of edges whose head is
r. Let us denote this latter set by $\Gamma_j$, simplifying our dynamical equation to

$$\dot{x}_j = \sum_{k \in \Gamma_j} w_{kj} x_k - \tau_j x_j + \sigma_r u_j \tag{3}$$

where $w_{kj}$ is the element in the mixing matrix of vertex r that corresponds to edge k (as
inbound edge) and edge j (as outbound edge), $\tau_j$ is the damping term related to edge j,
$\sigma_r$ is the driver node indicator of the vertex r from which edge j originates, and
$u_j$ is equal to the value of the input signal affecting the state variable of edge j. Defining
$w_{kj} = 0$ for all $k \notin \Gamma_j$ yields

$$\dot{x} = (W - T)x + Hu \tag{4}$$

where the matrices are as follows:

• $W = [w_{kj}]$ is a matrix where $w_{kj}$ may be nonzero if and only if the head of edge k is
the tail of edge j.

• T is a diagonal matrix with the damping terms of each edge in the main diagonal.

• H is a diagonal matrix where the jth diagonal element is $\sigma_r$ if vertex r is the tail of
edge j.

Eq. 4 essentially describes a simple linear time-invariant dynamical system of the form
$\dot{x} = Ax + Bu$ with the substitution A = W - T and B = H. It is also easy to see that
W is the adjacency matrix of the line digraph L(G) of the original digraph G by definition.
The nodes of L(G) thus correspond to the edges of the original network G, and each edge
of L(G) represents a length-two directed path of G. An example network G is shown in
Figure 1a, and its corresponding line digraph in Figure 1b. The loop edges arising from the
damping term -T in Eq. 4 are omitted from Figures 1a and 1b, partly for the sake of clarity,
and partly because we will soon demonstrate that such edges do not change the optimal
control configuration.
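
The construction of the line digraph can be sketched in a few lines of Python. This is
an illustrative sketch only: the function name and the representation of G as a list of
(tail, head) pairs are assumptions made for the example, and the small edge list in the usage
part is not the network of Figure 1.

```python
def line_digraph(edges):
    """Return the arcs of L(G) as pairs (k, j) of edge indices of G.

    An arc k -> j is created whenever the head of edge k is the tail of
    edge j, i.e. whenever edges k and j form a directed path of length two.
    These are exactly the positions where w_kj may be nonzero.
    """
    edges_leaving = {}
    for j, (tail, _head) in enumerate(edges):
        edges_leaving.setdefault(tail, []).append(j)

    arcs = []
    for k, (_tail, head) in enumerate(edges):
        for j in edges_leaving.get(head, []):
            arcs.append((k, j))
    return arcs


if __name__ == "__main__":
    # Hypothetical example graph: a -> b, b -> c, b -> d, c -> a
    g_edges = [("a", "b"), ("b", "c"), ("b", "d"), ("c", "a")]
    for k, j in line_digraph(g_edges):
        print(g_edges[k], "->", g_edges[j])
```

The number of arcs produced equals the sum of in-degree times out-degree over the vertices
of G, since every (inbound edge, outbound edge) pair at a vertex yields one arc.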






Figure 1: (a) An example network G with six vertices and nine edges. The switchboard
dynamics takes place on the edges of the network. (b) The line graph L(G) corresponding
to G. A linear time-invariant dynamics on the vertices of this network is equivalent to the
switchboard dynamics on G. Node labels refer to the endpoints of the edges in G to which they
correspond. (c) Applying the maximum matching theorem to L(G) yields disjoint control
paths. (d) The control paths in G, mapped back from L(G). Note how each path in L(G)
became an edge-disjoint walk in G. Numbers represent the order in which the edges have to
be traversed in the walks. The two driver nodes are a and e since each walk starts from either
a or e.

2 Structural controllability of the switchboard dynamics 

Applying the maximum matching theorem of Liu et al [32] to L(G) (Figure 1b) gives us a
set of control paths and driven nodes in the line digraph (Figure 1c), or equivalently, a set of
driven edges in the original graph G. Since edges can be controlled only via the offset vectors,
the set of driver nodes is given by collecting those vertices that have at least one outbound
driven edge. However, note that the maximum matching theorem guarantees only that the
number of driven nodes in L(G) will be minimal, and this does not imply that the obtained
set of driver nodes in G is also minimal.

Let us now compare the control paths obtained from the maximum matching in the line
graph L(G) in Figure 1c with the corresponding control paths in the original graph G in
Figure 1d. It can be seen that the maximum matching consists of vertex-disjoint open and
closed paths (also called stems and buds) in L(G), and mapping these paths back to G yields
edge-disjoint open and closed walks in G. The walks together form a complete cover of the
edges of G. Since the first vertex of each stem has to be driven in L(G), the driver nodes in G
are those from which the corresponding open edge-disjoint walks originate. Our goal is thus
to find a cover that minimizes the number of nodes from which open walks originate in G.

Let us call a vertex v divergent if $d_v^+ > d_v^-$, convergent if $d_v^+ < d_v^-$, and balanced if
$d_v^+ = d_v^-$, where $d_v^+$ and $d_v^-$ denote the out-degree and the in-degree of v, respectively,
and let us define a balanced component as a connected component consisting solely
of balanced vertices and at least one edge. Our key result (which can also be formulated as a
theorem) is that the minimum set of driver nodes required to control the SBD on a network
G(V, E) can be determined by selecting the divergent vertices of G and one arbitrary vertex
from each balanced component. The formal proof is given in the Appendix.
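
The selection rule stated above can be written down directly. The following Python sketch
is only an illustration under stated assumptions: the function name is hypothetical, the
network is given as a list of (tail, head) pairs with mutually comparable vertex labels,
"connected component" is read as weakly connected component, and the vertex picked from a
balanced component (the smallest label) is an arbitrary choice.

```python
from collections import defaultdict


def sbd_driver_nodes(edges):
    """Divergent vertices plus one vertex from each balanced component."""
    out_deg = defaultdict(int)
    in_deg = defaultdict(int)
    neighbours = defaultdict(set)  # undirected adjacency, for weak components
    for tail, head in edges:
        out_deg[tail] += 1
        in_deg[head] += 1
        neighbours[tail].add(head)
        neighbours[head].add(tail)

    vertices = set(neighbours)  # only vertices incident on at least one edge
    drivers = {v for v in vertices if out_deg[v] > in_deg[v]}

    seen = set()
    for start in sorted(vertices):
        if start in seen:
            continue
        # Collect the weakly connected component of `start` by depth-first search.
        component, stack = [], [start]
        seen.add(start)
        while stack:
            v = stack.pop()
            component.append(v)
            for w in neighbours[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        # A component made up solely of balanced vertices needs one driver node.
        if all(out_deg[v] == in_deg[v] for v in component):
            drivers.add(min(component))
    return drivers


if __name__ == "__main__":
    cycle = [(0, 1), (1, 2), (2, 0)]   # balanced component
    out_star = [(3, 4), (3, 5)]        # vertex 3 is divergent
    print(sorted(sbd_driver_nodes(cycle + out_star)))
```

For an out-star the only divergent vertex is the hub, so a single driver node is returned
no matter how many leaves the hub has.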

The above theorem has two important implications. First, it explains why we are safe 
to ignore loop edges in L(G): a loop edge of a vertex in G increases both its in-degree and 
its out-degree by one, thus a divergent vertex stays divergent, and a non-divergent vertex 
stays non-divergent. Second, the theorem shows that the number of driver nodes required 
to control the SBD is almost completely determined by the joint degree distribution of the 
network. This is in concordance with the results of Liu et al [32] for the linear time-invariant 
nodal dynamics. 



3 Controllability of real networks 

We have determined the set of driver nodes under the switchboard dynamics for 38 real
networks classified into 11 categories and compared the fraction of driver nodes $n_D$ with the
model of Liu et al [32] and with its expected value after different types of randomizations
(see Table 1). A striking difference between the switchboard dynamics and the model of Liu
et al [32] can be seen for two classes of networks. Regulatory networks such as the
transcriptional regulatory network of E. coli (TRN-EC-2 [37]) and S. cerevisiae (TRN-Yeast-1 [?],
TRN-Yeast-2 [37]) and the ownership network of US telecommunications and media
corporations (Ownership-USCorp [45]) turned out to be well-controllable under the switchboard
dynamics but they are very hard to control in the linear nodal dynamics. This can be
explained by the fundamental difference between the two models. In the linear nodal dynamics,
a driven node may not influence its subordinates independently of each other, thus the
presence of out-hubs in a network degrades its controllability significantly. In the switchboard
dynamics, out-hubs behave the opposite way, allowing one to control many state variables
with a single out-hub. It follows that driver nodes prefer out-hubs in the SBD, while they are
shown to avoid hubs in the linear nodal dynamics. Therefore, hubs have an important role
not only in maintaining the connectivity of a network in case of random failures [5, 15, 24] and
containing epidemic spreading [50, 51], but they also make it possible to control the network
efficiently with a smaller number of driver nodes.

The other class of networks with the largest difference between the two models is the case
of intra-organizational networks [17, 23, 41]. In the model used by Liu et al, all these networks
can be controlled by at most three nodes. On closer examination, it turns out that at least 75%
of the connections in each of these networks are reciprocal, i.e. an edge exists between vertices
A and B in both directions. A reciprocal edge pair can easily form a bud in a maximum
matching, requiring no driver node on its own, therefore high reciprocity in a network always
implies a low fraction of driver nodes in the linear nodal dynamics, while this is not necessarily
true for the SBD.

Comparing the fraction of driver nodes for the SBD with the randomized variants
reveals that in most cases, the fraction of driver nodes required to control a random
Erdős-Rényi network [12, 20] of the same size is larger than the fraction of driver nodes for the
real-world network, suggesting that the structure of these networks is at least partially
optimized for controllability. Notable exceptions are the electronic circuits [37], the neural
network of C. elegans [?, 62], most of the World Wide Web networks [2, 4, 28, 48], and the
intra-organizational networks [17, 23, 41]. Preserving the in- and out-degree distributions (but
not the joint distribution) brings the fraction of driver nodes closer to the observed one after
randomization, and keeping the joint degree distribution makes the fraction of driver nodes
practically the same up to a difference of ±0.002 in the networks we have studied, confirming
that the effect of balanced components on the fraction of driver nodes is indeed negligible for
large real-world networks. Edge deletion experiments (see Appendix) also indicate that the
optimal control configurations in the studied networks are robust to single link failures as the
networks mostly remain controllable with the same number of driver nodes after the removal
of a single edge.

Table 1: Controllability properties of the real networks analysed in this paper

Type                  #    Name              Nodes        Edges        SBD     Nodal   ER      Degree
Regulatory            1.   Ownership-USCorp  7,253        6,726        0.160   0.820   0.339   0.085
                      2.   TRN-EC-2          418          519          0.222   0.751   0.366   0.148
                      3.   TRN-Yeast-1       4,441        12,873       0.034   0.965   0.415   0.033
                      4.   TRN-Yeast-2       688          1,079        0.177   0.821   0.381   0.137
Trust                 5.   College*          32           96           0.344   0.188   0.418   0.315
                      6.   Epinions*         75,888       508,837      0.336   0.549   0.445   0.448
                      7.   Prison*           67           182          0.403   0.134   0.411   0.451
                      8.   Slashdot*         82,168       948,464      0.323   0.045   0.458   0.392
                      9.   WikiVote          ?            ?            ?       ?       ?       ?
Food web              10.  Grassland         88           137          0.318   0.523   0.381   0.297
                      11.  Little Rock       ?            2,494        ?       ?       ?       ?
                      12.  Seagrass          49           226          0.449   0.265   0.436   0.433
                      13.  Ythan             135          601          0.304   0.511   0.432   0.337
Metabolic             14.  C. elegans        ?            2,864        0.182   0.302   0.409   0.309
                      15.  E. coli           ?            ?            0.182   0.382   0.409   0.309
                      16.  S. cerevisiae     1,511        3,833        0.185   0.329   0.409   0.313
Electronic circuits   17.  s208a             ?            189          ?       ?       0.381   ?
                      18.  s420a             ?            399          ?       ?       ?       0.440
                      19.  s838a             512          819          0.459   0.232   0.381   0.442
Neuronal and brain    20.  C. elegans        ?            ?            ?       ?       ?       ?
                      21.  Macaque           ?            ?            ?       ?       ?       ?
Citation              22.  arXiv-HepPh*      34,546       421,578      0.356   0.232   0.459   0.577
                      23.  arXiv-HepTh       ?            ?            ?       ?       ?       ?
WWW                   24.  Google            15,763       171,206      0.670   0.337   0.457   0.612
                      25.  Polblogs          1,490        19,090       0.509   0.471   0.460   0.501
                      26.  nd.edu            325,729      1,497,134    0.271   0.677   0.433   0.301
                      27.  stanford.edu      ?            ?            ?       ?       ?       ?
Internet              28.  p2p-1             10,876       39,994       0.334   0.552   0.425   0.344
                      29.  p2p-2             8,846        31,839       0.344   0.578   0.423   0.344
                      30.  p2p-3             8,717        31,525       0.343   0.577   0.424   0.344
Social communication  31.  Twitter*†         41.7 × 10^6  1.47 × 10^9  0.402   -       0.476   0.434
                      32.  UCIOnline         1,899        20,296       0.216   0.323   0.456   0.375
                      33.  WikiTalk          2,394,385    5,021,410    0.022   0.968   0.399   0.026
Organizational        34.  Consulting*       46           879          0.522   0.043   0.458   0.460
                      35.  Freemans-1*       34           645          0.412   0.088   0.441   0.476
                      36.  Freemans-2*       34           830          0.588   0.029   0.439   0.465
                      37.  Manufacturing*    77           2,228        0.597   0.013   0.468   0.424
                      38.  University*       81           817          0.519   0.012   0.451   0.532

Notations are as follows: fraction of driver nodes under the switchboard dynamics (SBD) and the simple
nodal dynamics [32] (Nodal); fraction of driver nodes under the switchboard dynamics in randomized networks
using the Erdős-Rényi model (ER) and the degree-preserving configuration model (Degree). Note that this
latter model does not preserve the joint degree distribution. Results for null models are averaged from 100
randomizations. Networks where the edges were reversed compared to the original publication are marked by
* (see Appendix, section C.1). Results calculated directly from the degree distribution (i.e. not taking into
account balanced components) are marked by †.



4 Analytical results for model networks 

The dependence of $n_D$ on the joint degree distribution allows us to derive analytical formulae
for the expected fraction of driver nodes for a wide variety of model networks (see Appendix
for the exact derivations). For Erdős-Rényi digraphs [12, 20] with n vertices and an edge
probability of p, $n_D$ is given as follows:

$$n_D^{\mathrm{ER}} = \frac{1}{2} - \frac{e^{-2\langle k \rangle}}{2} I_0(2\langle k \rangle) \tag{5}$$

where $\langle k \rangle = np$ is the average in- and out-degree and $I_\alpha(x)$ is the modified Bessel function
of the first kind. The function converges rapidly to 0.5 as $\langle k \rangle$ increases. Similar results are
obtained for graphs with independent exponential in- and out-degree distributions $C e^{-k/\kappa}$
where $\kappa = 1/\log\frac{1 + \langle k \rangle}{\langle k \rangle}$:

$$n_D^{\mathrm{Exp}} = \frac{\langle k \rangle}{2\langle k \rangle + 1} \tag{6}$$

which also approaches 0.5 rapidly as $\langle k \rangle \to \infty$ (Figure 2a). For power-law distributed digraphs
[8, 13] with $P(d^+ = k) = P(d^- = k) = C k^{-\gamma} e^{-k/\kappa}$, $n_D$ is given by

$$n_D^{\mathrm{power}} = \frac{1}{2} - \frac{\mathrm{Li}_{2\gamma}(e^{-2/\kappa})}{2\,\mathrm{Li}_{\gamma}(e^{-1/\kappa})^2} \tag{7}$$

where $\mathrm{Li}_s(z)$ is the polylogarithm function of order s. As $\kappa \to \infty$, this converges to

$$n_D^{\mathrm{power}} = \frac{1}{2} - \frac{\zeta(2\gamma)}{2\zeta(\gamma)^2} \tag{8}$$

(where $\zeta(x)$ is the Riemann zeta function) in the absence of any exponential cutoff (Figure 2b).
The Appendix also contains the analytical treatment of k-regular networks.
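
The closed forms for the Erdős-Rényi, exponential and cutoff-free power-law cases are easy
to evaluate numerically. The sketch below is illustrative and uses only the Python standard
library; the function names, the truncated power series for $I_0$ and the partial-sum
approximation of the zeta function (with a simple tail correction) are choices made for this
example rather than part of the derivation.

```python
import math


def bessel_i0(x, terms=200):
    """Modified Bessel function I_0(x) from its power series (moderate x only)."""
    total, term = 0.0, 1.0
    for j in range(terms):
        total += term
        term *= (x / 2.0) ** 2 / (j + 1) ** 2
    return total


def zeta(s, cutoff=10_000):
    """Riemann zeta for s > 1: partial sum plus an Euler-Maclaurin tail estimate."""
    head = sum(k ** -s for k in range(1, cutoff))
    return head + cutoff ** (1.0 - s) / (s - 1.0) + 0.5 * cutoff ** -s


def n_d_er(mean_degree):
    """Expected driver-node fraction for Erdős-Rényi digraphs."""
    return 0.5 - 0.5 * math.exp(-2.0 * mean_degree) * bessel_i0(2.0 * mean_degree)


def n_d_exp(mean_degree):
    """Expected driver-node fraction for exponential degree distributions."""
    return mean_degree / (2.0 * mean_degree + 1.0)


def n_d_power_limit(gamma):
    """Expected driver-node fraction for power laws without cutoff (gamma > 1)."""
    return 0.5 - zeta(2.0 * gamma) / (2.0 * zeta(gamma) ** 2)


if __name__ == "__main__":
    for k in (0.5, 1.0, 2.0, 5.0, 10.0):
        print(k, round(n_d_er(k), 4), round(n_d_exp(k), 4))
    print(round(n_d_power_limit(2.0), 4))
```

As a sanity check on the last expression, $\zeta(4)/\zeta(2)^2 = 2/5$, so the limiting value
for $\gamma = 2$ is exactly 0.3.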

It is worthwhile to compare these analytical results with those of Liu et al [32], who
have found that the fraction of driver nodes $n_D$ decreases for both Erdős-Rényi and
scale-free networks as $\langle k \rangle \to \infty$, while these networks behave the opposite way under the SBD.
For $\langle k \rangle \to \infty$, the fraction of driver nodes tends to 1/2 for Erdős-Rényi networks and to
$1/2 - \zeta(2\gamma)/(2\zeta(\gamma)^2)$ for scale-free networks. The consequence is that denser networks are
harder to control (as expected by our intuition), and that scale-free networks with a given $\langle k \rangle$
are easier to control than an Erdős-Rényi network with the same average degree. This can
partly be attributed to the higher frequency of short loops [10] in scale-free networks: these
loops can be covered by closed walks and do not require extra driver nodes.

Figure 2: (a) Expected fraction of driver nodes $n_D$ in Erdős-Rényi (ER) and exponential
(Exp) networks as a function of the average degree $\langle k \rangle$. (b) Expected fraction of driver
nodes $n_D$ in scale-free networks with exponential cutoff as a function of the exponent $\gamma$ of the
degree distribution, for different cutoff values $\kappa$. On both panels, symbols denote the results
of simulations on networks with 10^ nodes, solid lines correspond to the analytical results.

5 The effect of degree correlations 

Our analytical results assumed that the in-degree and the out-degree of a node are uncorrelated,
which was true for all of the model networks we have studied. However, one-point degree
correlations in real networks are significantly different from zero [9]. To study how such
correlations affect the fraction of driver nodes, we have performed simulations on
Erdős-Rényi networks and scale-free networks with an exponential cutoff and varied the correlation
as follows. First, we generated an instance of the network model with n = 10^ nodes and
calculated the in- and out-degree sequences. These instances were uncorrelated since neither
the Erdős-Rényi model nor the configuration model (which we have used to generate
scale-free networks) introduces correlations between the in- and out-degree of the same node. Next,
while keeping the in-degree sequence intact, we started swapping elements in the out-degree
sequence randomly such that only those swaps were performed which increased the correlation.
The process was continued until we were not able to increase the correlation any more in the
last t steps (where t = 10^ in our simulations). A similar greedy algorithm was executed
from the original degree sequences in the opposite direction, performing swaps only if they
decreased the correlation, terminating when it was not possible to decrease the correlation
any more in the last t steps. The fraction of driver nodes $n_D$ was then calculated in the
original configuration and whenever the absolute difference of the calculated in- and
out-degree correlation between the last examined state and the current state became larger than
0.01.
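
The greedy swapping procedure can be sketched as follows. This is a simplified illustration,
not the simulation code behind the figures: the function names and the `patience` parameter
(playing the role of t) are hypothetical, only the final out-degree sequence is returned
instead of sampling $n_D$ along the way, and the change in correlation is detected through
the change in the sum of in-degree times out-degree products, which has the same sign because
a swap leaves the means and variances of both sequences untouched.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def greedy_correlate(in_deg, out_deg, increase=True, patience=1000, rng=None):
    """Swap out-degrees at random, keeping only swaps that move the
    in/out-degree correlation in the requested direction; stop after
    `patience` consecutive rejected swaps."""
    rng = rng or random.Random(0)
    out = list(out_deg)
    n = len(out)
    rejected = 0
    while rejected < patience:
        a, b = rng.randrange(n), rng.randrange(n)
        # Change of sum(in * out) caused by swapping out[a] and out[b].
        delta = -(in_deg[a] - in_deg[b]) * (out[a] - out[b])
        if (delta > 0) if increase else (delta < 0):
            out[a], out[b] = out[b], out[a]
            rejected = 0
        else:
            rejected += 1
    return out


def divergent_fraction(in_deg, out_deg):
    """Fraction of vertices whose out-degree exceeds their in-degree."""
    return sum(o > i for i, o in zip(in_deg, out_deg)) / len(in_deg)


if __name__ == "__main__":
    rng = random.Random(42)
    ins = [rng.randrange(10) for _ in range(500)]
    outs = [rng.randrange(10) for _ in range(500)]
    for label, seq in (("original", outs),
                       ("positive", greedy_correlate(ins, outs, True)),
                       ("negative", greedy_correlate(ins, outs, False))):
        print(label, round(pearson(ins, seq), 3), round(divergent_fraction(ins, seq), 3))
```

Because the swaps only permute the out-degree sequence, both marginal degree distributions
are preserved exactly while the one-point correlation changes.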

The results are depicted in Figures 3a and 3b, both of which clearly show a general trend:
increasing the correlation between the in- and out-degrees decreases the fraction of driver
nodes. Negative one-point correlations yield a higher fraction of driver nodes since these
networks are very unlikely to contain balanced nodes: a vertex either has a high in-degree
and a low out-degree or a high out-degree and a low in-degree. In other words, negative
correlations indicate a clear separation of responsibilities between the nodes of the network:
divergent nodes are strongly divergent with a large difference between the out-degree and
the in-degree, while convergent nodes are strongly convergent. Positive correlations indicate
that nodes often represent complex decision processes which map a high-dimensional input
space into a similarly high-dimensional output space. Strong positive correlations also yield
networks with a higher number of short loops [?], which can then be covered by closed walks
that do not require driver nodes on their own.

Figure 3: (a) Fraction of driver nodes $n_D$ in Erdős-Rényi (ER) networks with different
average degree $\langle k \rangle$ as a function of in- and out-degree correlation ($\rho$). (b) Fraction of driver
nodes $n_D$ in scale-free networks with different exponents $\gamma$ as a function of in- and out-degree
correlation ($\rho$). On both panels, every fifth data point is marked by a symbol. Each data
point was obtained by averaging at least 20 different realizations of the network model; error
bars were omitted as they were smaller than the symbols. Note that it is very hard to
introduce negative degree correlations in the case of scale-free networks and none of our test
runs managed to decrease the correlation below -0.2.



6 Conclusions 

We have presented a linear time-invariant dynamical model where state variables correspond
to the edges of a directed complex network, and the nodes of the network act as linear
operators that map state variables of inbound edges to outbound edges. We have demonstrated
that the minimum number of driver nodes for such systems is largely determined by the
joint degree distribution of the network. A comprehensive survey of 38 real-world networks
showed that transcriptional regulatory networks are well-controllable with a small number of
driver nodes under the switchboard dynamics, and that most real-world networks are easier
to control than random Erdős-Rényi networks with the same number of nodes and edges.
This is very different from the findings of Liu et al [32] who have reported a high fraction
of driver nodes under linear nodal dynamics on regulatory networks and that randomized
Erdős-Rényi networks are easier to control than the real-world ones. The results suggest that
one should choose the dynamical model carefully before studying the controllability properties
of a real-world network as it may affect the results to a very large extent.

The behaviour of the nodal and edge dynamics is markedly different in highly hierarchical,
tree-like networks where the presence of central out-hubs rapidly increases the required
number of driver nodes for the linear nodal dynamics of Liu et al, while the same out-hubs
allow efficient control of many subordinate nodes and thus decrease the required number of
driver nodes in the switchboard dynamics. Such hierarchies are ubiquitous in nature and
society, from scales as small as gene regulatory networks [?, 37], through leader-follower
relationships of flocking pigeons [39], to the large-scale organization of some man-made social
structures like the Wikipedia talk network [29] or the ownership network of US media and
telecommunications corporations [45]. The presence or absence of hierarchy thus seems to be
an important contributing factor of the controllability properties of large dynamical systems.

As it happens so often in scientific research, the framework we have presented raises more 
questions than answers. For instance, it is yet unknown how the switchboard dynamics would 
behave in the presence of noise or nonlinearity, or in cases when it is enough to control only a 
subset of the state variables (output controllability) and only ensure that the uncontrollable 
ones have stable dynamics (stabilizability). However, as we have shown, even the first steps 
along our approach could be used to deepen our understanding of the origins of controllability 
of real-world networks.
Appendices



A Structural controllability 
A.1 Controllability conditions

A continuous linear dynamical system of the form $\dot{x} = Ax$, $A \in \mathbb{R}^{n \times n}$ is said to be controllable
by a set of piecewise continuous input signals u if we are able to drive the state vector x from
any arbitrary initial state to any given state in finite time, assuming that the input signals
are injected to the linear system according to the following dynamical equation:

$$\dot{x} = Ax + Bu \tag{9}$$

where $B \in \mathbb{R}^{n \times m}$ is a matrix that describes how the input signals affect the derivatives of the
state variables. A is usually called the state matrix and B the control matrix. Note that the
structure of B is not constrained in any way; one can connect any of the input signals to any
of the state variables.

The Kalman rank condition states that the system is controllable if and only if the
controllability matrix $[B \;\; AB \;\; A^2B \;\; \dots \;\; A^{n-1}B]$ has rank n, where n is the number of state
variables [26, 60]. However, the rank condition is not constructive since it does not tell us how
to find an appropriate B for a given A (preferably with a minimum number of columns), and
even testing the Kalman rank condition is computationally expensive and numerically
unstable for large n. Due to these difficulties, control theorists turned to the concept of structural
controllability, first introduced by Lin in his seminal paper [31].
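
For small systems, the Kalman rank condition can be checked directly. The sketch below is
only an illustration: the helper names are hypothetical, and exact rational arithmetic from
the Python standard library is used in place of floating-point linear algebra, which avoids
rounding issues for toy examples but does not scale to large n.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def rank(M):
    """Rank of a matrix of Fractions by Gaussian elimination."""
    M = [row[:] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r


def is_controllable(A, B):
    """Kalman test: rank [B, AB, ..., A^(n-1) B] == n."""
    n = len(A)
    A = [[Fraction(a) for a in row] for row in A]
    block = [[Fraction(b) for b in row] for row in B]
    controllability = [[] for _ in range(n)]
    for _ in range(n):
        for i in range(n):
            controllability[i].extend(block[i])
        block = mat_mul(A, block)
    return rank(controllability) == n


if __name__ == "__main__":
    chain = [[0, 0], [1, 0]]                  # x1 drives x2
    star = [[0, 0, 0], [1, 0, 0], [1, 0, 0]]  # x1 drives x2 and x3
    print(is_controllable(chain, [[1], [0]]))
    print(is_controllable(star, [[1], [0], [0]]))
```

The out-star driven only at its hub fails the test because the two leaves always receive
proportional signals, which is the matrix-level counterpart of a driven node not being able
to control its subordinates independently.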

In the structural controllability framework, one assumes that the matrices A and B contain
two kinds of elements: fixed zeros and free parameters. The free parameters of the matrices
may assume any real value and are independent of each other. A system with state matrix
A and control matrix B is then structurally controllable if it is possible to set the free
parameters of the matrices in a way that the system becomes controllable in the usual sense.
It can also be shown that if a system is structurally controllable, then it is also controllable
for all parameter values except a set of combinations with zero Lebesgue measure [31, 57];
in other words, structural controllability is a general property of the system and it implies
controllability for almost all combinations of the free parameters.

Sufficient and necessary conditions for the structural controllability of linear time-invariant
dynamical systems with known state and control matrices $A = [a_{ij}]$ and $B = [b_{ij}]$ were
given earlier in the literature [38, 54]. The graph-theoretic formulation of the condition
is given as follows. Let $G(V, E)$ be the graph representation of the dynamic system, where
$V = \{x_1, x_2, \ldots, x_n, u_1, u_2, \ldots, u_m\}$, $E = E_A \cup E_B$,
$E_A = \{(x_i, x_j) \mid a_{ji} \neq 0\}$ and $E_B = \{(u_i, x_j) \mid b_{ji} \neq 0\}$.
Similarly, let $G^*(V^*, E^*)$ be the so-called bipartite graph representation of
the system, where $V^* = V_R \cup V_C \cup V_U$, $V_R = \{x_1^+, x_2^+, \ldots, x_n^+\}$,
$V_C = \{x_1^-, x_2^-, \ldots, x_n^-\}$, $V_U = \{u_1, u_2, \ldots, u_m\}$, $E^* = E_A^* \cup E_B^*$,
$E_A^* = \{(x_i^+, x_j^-) \mid a_{ji} \neq 0\}$ and $E_B^* = \{(u_i, x_j^-) \mid b_{ji} \neq 0\}$.
Note that there exists a bijection between $E$ and $E^*$: $(x_i, x_j)$ in $E$ is equivalent to
$(x_i^+, x_j^-)$ in $E^*$, and $(u_i, x_j)$ in $E$ is equivalent to $(u_i, x_j^-)$ in $E^*$. The system
is then structurally controllable if and only if the following two conditions hold:

1. For all $v \in V$, $v$ is reachable from at least one of $\{u_1, u_2, \ldots, u_m\}$ via directed
paths in G. This is called the reachability condition.

2. G* contains n independent edges, where a set of edges is independent if every vertex in 
V* is incident on at most one of the edges. 

The bijection between E and E* means that the set of independent edges satisfying the
above conditions selects n edges from E such that each vertex $v \in V$ is incident on at
most one inbound and at most one outbound selected edge. For the sake of simplicity and also
to conform with the terminology introduced earlier by Liu et al. [32], we will call a set of
edges satisfying this condition a matching,¹ vertices with inbound selected edges matched
and vertices without such edges unmatched. Note that all input vertices $u_i$ are unmatched
since they have no inbound edges, and no ordinary vertex $v_i$ will be unmatched because 1)
we have selected n independent edges, 2) there are exactly n ordinary vertices, 3) we know
that none of the input vertices are matched, and 4) a selected edge can make only one vertex
matched.

The selected edges form vertex-disjoint directed paths and cycles in G. The directed
paths are called stems and they always originate from one of the input vertices $u_i$ (since
the first vertex of a stem is unmatched and only the input vertices are unmatched). The
directed cycles are called buds. Stems and buds together form the set of control paths, since
we can think about them in an informal way as principal routes along which control signals
propagate in the system. A peculiar property of buds is that they do not require a separate
control signal: if any vertex of a bud is adjacent to the vertex of a stem, then the stem itself
will be responsible for providing the appropriate input to the vertices of the bud as well. Note
that due to the reachability condition (condition 1 above) there can be no bud in the system
that is not adjacent to any of the stems since each vertex is accessible from at least one input
vertex, and each input vertex is the root of one of the stems.

¹ The definition of "matching" in this manuscript is not to be confused with matchings on undirected graphs,
where it is required that the selected edges share no common vertices. In this manuscript, "matching" always
refers to a directed matching as defined above.

The above conditions can only be used to check whether the system is structurally
controllable once the control matrix B is known. In a recent paper [32], Liu et al. have proven
the maximum matching theorem, which states that the minimum number of input signals
required to control a system represented by its state matrix A can be determined by finding
a maximum matching in A or a maximum set of independent edges in the bipartite
representation of A (assuming no input vertices), and then counting the number of unmatched
vertices. The maximum matching theorem also constructs the matrix B in a way that the
bipartite representation of the system with state matrix A and control matrix B will satisfy
the above conditions for structural controllability. This is achieved by connecting an input
signal to every unmatched vertex of the graph to form one stem for each such vertex, and also
connecting these input signals to any buds that are not adjacent to stems in order to satisfy
the reachability condition. The nodes of the original network that are connected directly to
one of the input signals are called driven nodes, and the nodes of the input signals are called
driver nodes.
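The counting step of the maximum matching theorem can be illustrated with a short Python sketch. It is a minimal example and not the implementation used by Liu et al. [32]: the function names `max_matching` and `min_input_signals`, the vertex labels 0..n-1 and the (source, target) edge-list format are assumptions made here for illustration, and the matching is found with a simple augmenting-path search rather than an optimized algorithm.

```python
def max_matching(n, edges):
    """Maximum directed matching: every vertex has at most one selected
    inbound and at most one selected outbound edge."""
    successors = [[] for _ in range(n)]
    for source, target in edges:
        successors[source].append(target)
    # matched_from[v] = u means that the edge u -> v is selected
    matched_from = [None] * n

    def try_augment(u, seen):
        # Try to give u a selected outbound edge, re-routing others if needed
        for v in successors[u]:
            if v in seen:
                continue
            seen.add(v)
            if matched_from[v] is None or try_augment(matched_from[v], seen):
                matched_from[v] = u
                return True
        return False

    for u in range(n):
        try_augment(u, set())
    return matched_from


def min_input_signals(n, edges):
    """Number of input signals and the list of unmatched vertices."""
    matched_from = max_matching(n, edges)
    unmatched = [v for v in range(n) if matched_from[v] is None]
    # Even a perfect matching (buds only) needs one input signal
    return max(len(unmatched), 1), unmatched
```

For a directed path 0 → 1 → 2 → 3 the sketch reports a single unmatched vertex (the root of the only stem), while for a network consisting of self-loops only it reports no unmatched vertices and a single input signal.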

It is important to emphasize the distinction between driver and driven nodes, since the 
difference between them may be arbitrarily large. The reason for this is that one driver 
node may drive more than one driven node. To show that this is not just a rare theoretical 
possibility, we refer the reader to a recently published manuscript of Cowan et al. [16], where
the authors argue that the dynamic equations of real networks usually include a damping term 
for each state variable. These damping terms ensure that the system returns to some ground 
state in the absence of external stimuli. The damping terms are represented by nonzero 
diagonal elements in A and by self-loops in the network representation of the system. When 
all the nodes in the network are equipped with self-loops, a trivial maximum matching can 
be obtained by selecting the self-loops only, thus constructing n buds. According to the 
maximum matching theorem of Liu et al, the system then requires a single driver node only, 
which will be connected to all the nodes of the network. Thus, the number of driver nodes will 
be 1 and the number of driven nodes will be n, attaining the maximum possible difference 
of n − 1. Furthermore, the theorem then states that every real-world network with such
self-loops can be driven by a single input signal. 

The distinction between driver and driven nodes is fundamental in the simple linear
time-invariant nodal dynamics assumed by Liu et al., but not in the switchboard dynamics. In the
switchboard dynamics, driver nodes are internal to the system: these are the nodes whose
mixing matrix is controlled in order to drive the state variables of the edges into the desired
state. Choosing all the self-loops in the line graph L(G) of the switchboard dynamics would
simply promote every node of the original network with at least one outbound edge to a driver
node. Later on in Section A.2, we will prove that this is not necessarily an optimal solution
and also show a linear-time algorithm that selects the optimal driver node configuration.

A.2 Proof of our key result

For the sake of clarity, we repeat some definitions from the main part of the manuscript here.

Definition 1 (Divergent vertex). A vertex v in a digraph G(V, E) is divergent if $d_v^+ > d_v^-$,
where $d_v^+$ is the out-degree and $d_v^-$ is the in-degree of the vertex.

Definition 2 (Convergent vertex). A vertex v in a digraph G(V, E) is convergent if $d_v^+ < d_v^-$.

Definition 3 (Balanced vertex). A vertex v in a digraph G(V, E) is balanced if $d_v^+ = d_v^-$.

Definition 4 (Balanced component). A connected component $C \subseteq V$ in a digraph G(V, E)
is a balanced component if v is balanced for every $v \in C$ and C contains at least one edge.

We will also need a few more definitions and lemmas: 

Definition 5 (Edge-disjoint walk). An edge-disjoint walk of a digraph G(V, E) is a sequence
of vertices $v_0, v_1, \ldots, v_n$ such that $v_i \to v_{i+1}$ is a member of E for every $0 \le i < n$ and
each such edge appears in the walk only once. Such a walk is open if $v_0 \neq v_n$ and closed otherwise.

Lemma 1. For every connected component C of a digraph G(V, E), exactly one of the
following three statements is true:

1. C contains no edges.

2. C contains at least one convergent and at least one divergent vertex.

3. C is balanced.



Proof. Proving that at most one of the three statements can be true at the same time is trivial
and follows from the definitions above. To complete the proof, we must also show that at least
one of the statements must always be true. This is done by contradiction. Suppose that there
exists a connected component C in some graph G(V, E) for which none of the three statements
holds. C then either contains at least one convergent vertex and no divergent vertices, or at
least one divergent vertex and no convergent vertices. Both cases are contradictory since the
sum of in-degrees in any connected component C must be equal to the sum of out-degrees,
and balanced vertices contribute the same amount to both sums. □

Lemma 2. For every connected component $C \subseteq V$ of a digraph G(V, E) containing at least
one edge, one of the following two statements is true:

1. C can be covered by a single closed edge-disjoint walk.

2. C can be covered by a set of open edge-disjoint walks.



Proof. Lemma 1 states that for non-empty connected components, the component is either
balanced or contains at least one divergent vertex. If the component is balanced, the in-degree
of each vertex is equal to the out-degree, hence it is always possible to construct an Eulerian
circuit in it using Hierholzer's algorithm [21]. An Eulerian circuit is a closed edge-disjoint
walk by definition, thus C satisfies case 1.

If C is not balanced, there exists at least one divergent vertex in C. We then construct a
set of open walks using the following algorithm:

1. Select an arbitrary divergent vertex v. If there are no divergent vertices in the
component, go to step 4.

2. Build a walk starting from v by following an arbitrary outgoing edge of the current vertex
repeatedly until the walk gets stuck in a vertex w, while making sure that each edge is
included in the walk only once.

3. Store the walk, remove its edges from the component and go back to step 1. Note that
the walk is always open (v ≠ w) since v has more outbound edges than inbound ones,
hence the walk cannot get stuck in v.

4. At this step, there are no more divergent vertices in C. By Lemma 1, this implies that all
the vertices are balanced. Since C may have fallen apart into multiple connected
components after the edge removals, construct an Eulerian circuit for each sub-component
of C and store it as a closed walk.

The above algorithm provides us with a cover of C with edge-disjoint open and closed
walks. However, note that each closed walk can be eliminated by finding an open walk with
which it shares a vertex v and joining them together in a larger open walk which traverses
the original open walk from the beginning until it arrives at v, then traverses the closed walk,
and resumes the original open walk at v again. Repeating this procedure for every closed
walk in the cover provides us with a final cover containing open walks only. This corresponds
to case 2 in the lemma and concludes our proof. □
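The construction used in the proof can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the reference implementation: the function name `walk_cover`, the (source, target) edge-list input and the in-place splicing of closed walks are choices made for this example. Walks are returned as vertex sequences; the list slicing makes the sketch simpler but not linear-time.

```python
from collections import defaultdict


def walk_cover(edges):
    """Cover all edges with edge-disjoint walks.

    Returns (open_walks, closed_walks); closed walks only remain in
    balanced components, where no open walk exists to absorb them.
    """
    unused = defaultdict(list)   # unused outgoing edges of each vertex
    surplus = defaultdict(int)   # out-degree minus in-degree
    for source, target in edges:
        unused[source].append(target)
        surplus[source] += 1
        surplus[target] -= 1

    def follow(start):
        # Steps 2-3: follow unused edges until the walk gets stuck
        walk = [start]
        while unused[walk[-1]]:
            walk.append(unused[walk[-1]].pop())
        return walk

    def splice_closed_walks(walk):
        # Join leftover closed walks into `walk` at shared vertices
        i = 0
        while i < len(walk):
            if unused[walk[i]]:
                walk[i:i + 1] = follow(walk[i])
            else:
                i += 1
        return walk

    # One open walk per unit of out-degree surplus of each divergent vertex
    open_walks = [follow(v) for v in list(surplus)
                  for _ in range(max(surplus[v], 0))]
    # Step 4: the remaining edges are balanced; absorb them where possible
    open_walks = [splice_closed_walks(walk) for walk in open_walks]
    closed_walks = [splice_closed_walks(follow(v))
                    for v in list(unused) if unused[v]]
    return open_walks, closed_walks
```

For example, the edge list `[(0, 1), (1, 2), (2, 0), (0, 3)]` is covered by a single open walk from vertex 0 to vertex 3, while a directed triangle alone yields one closed walk.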

Our key result is then as follows: 

Theorem 1. The minimum set of driver nodes required to control the switchboard dynamics
on a network G(V, E) can be determined by selecting the divergent vertices of G and one
arbitrary vertex from each balanced component.

Proof. The proof will proceed as follows. First, we provide an algorithm which constructs an 
edge cover in G such that each open walk originates in a divergent node and each balanced 
component is covered by a single closed walk, giving an upper bound on the minimum number 
of driver nodes. Next, we show that every divergent node must be driven in G in any control 
configuration, and that one arbitrary vertex from each balanced component must also be 
driven, providing a lower bound on the minimum number of driver nodes. We then show that 
the upper and lower bounds coincide, therefore our algorithm is optimal. 

We have already shown in the main part of this manuscript that the switchboard dynamics
on G is equivalent to a linear time-invariant dynamics on the nodes of L(G), for which a set of
driver and driven nodes can be determined using the maximum matching theorem of Liu et
al. [32]. The maximum matching theorem states that a given matching in L(G)* yields a set of
stems (directed vertex-disjoint paths) and buds (directed vertex-disjoint cycles) in L(G), and
the roots of the stems (i.e. the first vertices in the order of traversal) have to be controlled by
external signals.² Buds do not require separate driver nodes because they are either adjacent
to a stem (and thus use the signal from the stem) or one of the nodes in the bud is connected
to an already existing input signal directly. The driven nodes will be the roots of the stems
and an arbitrarily chosen vertex in each bud that is not adjacent to a stem.

Each non-loop edge in the line digraph L(G) corresponds to a length-two path in G. This
implies that each stem in L(G) corresponds to a concatenation of length-two paths, yielding
an edge-disjoint walk on G, which may contain the same vertex multiple times but may not
traverse the same edge twice. Similarly, buds not containing a loop edge in L(G) correspond
to edge-disjoint closed walks on G, and buds consisting of a single loop edge in L(G) yield
a single open path of length 1 in G. Since each vertex in L(G) participates in either a stem
or a bud (but not both at the same time), mapping the stems and buds in L(G) back to
G provides us with a cover of G using edge-disjoint closed and open walks. Note that the
mapping is injective: an edge-disjoint walk in G can also be mapped back uniquely to a stem
or a bud in L(G). Therefore, a matching in L(G) is completely equivalent to a cover of G
with edge-disjoint walks, and we are free to work with either of them.

² In case of a non-maximum matching, some of the nodes have no incident edges selected in the matching;
these nodes can be considered as stems on their own.

A possible cover of edge-disjoint walks can be obtained using the algorithm described in
Lemma 2. Such a cover creates a closed walk for every balanced component and a set of open
walks for every non-balanced component. Mapping the walks to L(G) gives us a set of stems
and buds:

• Closed walks will become buds without loop edges in L(G).

• Open walks with at least two edges become stems in L(G).

• Open walks with a single edge will correspond to the appropriate loop edge in L(G),
thus becoming a bud with a single loop edge.



Together, these stems and buds form a set of control paths. Each stem requires an input 
signal, hence the first vertex of each open walk in G (i.e. every divergent vertex) will have 
to be driven. Since closed walks occur exclusively within balanced components, and each 
balanced component contains only one closed walk, the buds corresponding to them will not 
be adjacent to any of the stems in L(G), hence they will also have to be connected to some 
input signal directly. The only way we are allowed to achieve this in case of the switchboard 
dynamics is to promote one of the nodes in the bud to a driver node. Therefore, one driver 
node will be required for every balanced component and for every divergent node of G. We 
have thus obtained an upper bound on the number of driver nodes in G. To prove that the 
algorithm in Lemma 2 is optimal and conclude the proof, we will show that this is also a
lower bound. 

Assume that there exists a complete cover of the edges (i.e. a control configuration) of 
G and there exists a divergent node v such that v is not a driver node. Since v is not a 
driver node, there is no open walk originating from it. Let us now consider all the walks v 
is a part of. Closed walks enter and leave v the same number of times. Since no open walk 
originates from v, each open walk either enters and leaves v the same number of times, or 
terminates in v. Therefore, the number of covered inbound edges of v must be equal to or 
larger than the number of covered outbound edges of v. However, since v is divergent, it has 
more outbound edges than inbound edges, therefore at least one outbound edge is not covered. 
This contradicts our assumption that we are working with a complete cover. Therefore, by 
contradiction, we have shown that every divergent vertex of G must be a driver node in any 
control configuration. 



Due to the reachability condition of structural controllability, we must also ensure that there
is at least one driver node in every connected component not containing a divergent vertex.
Lemma 1 states that every connected component C of G is empty, balanced, or contains at least
one convergent and at least one divergent vertex. To satisfy the reachability condition, we must
therefore promote one of the vertices in every balanced component to a driver node. Therefore,
a lower bound on the number of driver nodes in G is the number of divergent nodes plus the
number of balanced components of G. Since the lower and upper bounds coincide, our algorithm
is optimal. This concludes our proof. □






Note that the algorithm given above allows one to determine the minimum set of driver
nodes for the switchboard dynamics on an arbitrary graph G(V, E) in O(n + m) time (where n
is the number of vertices and m is the number of edges): building the edge-disjoint walks takes
O(m) time (since each edge has to be evaluated only once), calculating the connected
components takes O(n + m), and an additional O(n) step can decide which connected components
are balanced.
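Since Theorem 1 characterizes the driver nodes purely in terms of degrees and connected components, the driver set can be computed without constructing the walks at all. The Python sketch below illustrates this; the function name `switchboard_driver_nodes` and the input format (vertices labelled 0..n-1, edges as (source, target) pairs) are assumptions for this example. For brevity it finds weakly connected components with a union-find structure, which is nearly but not strictly linear; a breadth-first search would give the O(n + m) bound stated above.

```python
def switchboard_driver_nodes(n, edges):
    """Divergent vertices plus one vertex from each balanced component."""
    out_degree = [0] * n
    in_degree = [0] * n
    parent = list(range(n))      # union-find forest over the vertices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for source, target in edges:
        out_degree[source] += 1
        in_degree[target] += 1
        parent[find(source)] = find(target)

    drivers = {v for v in range(n) if out_degree[v] > in_degree[v]}

    has_edge, unbalanced = set(), set()
    for v in range(n):
        root = find(v)
        if out_degree[v] + in_degree[v] > 0:
            has_edge.add(root)
        if out_degree[v] != in_degree[v]:
            unbalanced.add(root)

    # Pick the lowest-numbered vertex of every balanced component
    representative = {}
    for v in range(n):
        root = find(v)
        if root in has_edge and root not in unbalanced:
            representative.setdefault(root, v)
    return drivers | set(representative.values())
```

In a network consisting of the edges 0 → 1 → 2 → 0 and 0 → 3, a separate two-cycle 4 ↔ 5 and an isolated vertex 6, the sketch returns the divergent vertex 0 and one vertex of the balanced two-cycle.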



B Analytical results 

In the main part of this manuscript, we have presented analytical formulae for the expected
fraction of driver nodes in Erdős–Rényi, exponential and scale-free networks. These formulae
are based on the fact that the fraction of driver nodes depends almost completely on the
joint degree distribution of the network according to Theorem 1. By neglecting the possible
existence of balanced components, the fraction of driver nodes for graphs with a joint degree
distribution $P(d^- = i, d^+ = j) = p_{ij}$ is simply given by

$$n_D = \sum_{i=0}^{\infty} \sum_{j=i+1}^{\infty} p_{ij},$$

i.e. one simply has to calculate the sum of joint probabilities for cases when the in-degree
is smaller than the out-degree. When the in- and out-degrees are uncorrelated and identically
distributed (as in all the model networks we have presented in the main part of this
manuscript), it is also true that $p_{ij} = p_{ji}$, hence $n_D$ can also be written as

$$n_D = \frac{1}{2}\left(1 - \sum_{k=0}^{\infty} p_{kk}\right), \tag{11}$$

i.e. the fraction of driver nodes is equal to half the probability of non-balanced nodes. The
formulae presented in the main part of this manuscript are all based on Eq. 11 by substituting
the actual degree distribution of the network model in question.

B.1 Erdős–Rényi digraphs

For Erdős–Rényi digraphs with n vertices and an edge probability of p, both the in- and
out-degrees follow a Poisson distribution with $\langle k \rangle = np$, hence $n_D$ is given as follows:

$$n_D^{ER} = \frac{1}{2}\left(1 - \sum_{k=0}^{\infty} \left(\frac{\langle k \rangle^k e^{-\langle k \rangle}}{k!}\right)^2\right) = \frac{1}{2}\left(1 - e^{-2\langle k \rangle} I_0(2\langle k \rangle)\right) \tag{12}$$

where $I_\alpha(x)$ is the modified Bessel function of the first kind. An equivalent derivation follows
from the fact that the difference of the in- and out-degree of a node follows a Skellam
distribution [58], thus the probability of balanced nodes is equal to the value of the probability
mass function of Skellam($\langle k \rangle$, $\langle k \rangle$) at x = 0.
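The Bessel-function form can be checked numerically against the direct sum over the Poisson degree distribution. The following Python sketch is only a sanity check under stated assumptions: the function names are hypothetical, the infinite sum is truncated at 200 terms, and $I_0$ is evaluated from its standard integral representation $I_0(x) = \frac{1}{\pi}\int_0^{\pi} e^{x \cos\theta}\, d\theta$ with a midpoint rule instead of a library routine.

```python
import math


def balanced_probability_poisson(mean_degree, terms=200):
    """Sum of squared Poisson probabilities, i.e. P(in-degree == out-degree)."""
    total = 0.0
    p = math.exp(-mean_degree)          # Poisson probability of k = 0
    for k in range(terms):
        total += p * p
        p *= mean_degree / (k + 1)      # recurrence for the next probability
    return total


def bessel_i0(x, steps=2000):
    """Modified Bessel function I_0 via its integral representation."""
    h = math.pi / steps
    return sum(math.exp(x * math.cos((j + 0.5) * h)) for j in range(steps)) * h / math.pi


def driver_fraction_er(mean_degree):
    """Closed form: half the probability that a vertex is not balanced."""
    return 0.5 * (1.0 - math.exp(-2.0 * mean_degree) * bessel_i0(2.0 * mean_degree))


def driver_fraction_er_direct(mean_degree):
    """Same quantity from the truncated sum over the degree distribution."""
    return 0.5 * (1.0 - balanced_probability_poisson(mean_degree))
```

Both functions should agree to within numerical precision for moderate average degrees, and the resulting fraction approaches 1/2 from below as the average degree grows.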

B.2 Exponential networks 

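In exponential networks, in-degrees and out-degrees are assumed to be distributed with
$P(d^+ = k) = P(d^- = k) = C e^{-k/\kappa}$, where $C = 1 - e^{-1/\kappa}$ and
$\kappa = 1/\log\frac{\langle k \rangle + 1}{\langle k \rangle}$. The expected value of $n_D$ then follows from simple
algebraic manipulations:

$$n_D^{exp} = \frac{1}{2}\left(1 - \frac{1 - e^{-1/\kappa}}{1 + e^{-1/\kappa}}\right) = \frac{e^{-1/\kappa}}{1 + e^{-1/\kappa}} = \frac{\langle k \rangle}{\langle k \rangle + 1} \cdot \frac{\langle k \rangle + 1}{2\langle k \rangle + 1} = \frac{\langle k \rangle}{2\langle k \rangle + 1} \tag{13}$$

where in the penultimate step we have made use of $e^{-1/\kappa} = \frac{\langle k \rangle}{\langle k \rangle + 1}$.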
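For the degree distribution $C e^{-k/\kappa}$ stated above, summing the squared probabilities as a geometric series gives $\langle k \rangle / (2\langle k \rangle + 1)$ for the driver node fraction. The short Python sketch below compares this closed form with a truncated direct summation; the function names and the truncation at 5000 terms are assumptions made for this check only.

```python
def driver_fraction_exponential(mean_degree):
    """Closed form <k> / (2<k> + 1) for exponential degree distributions."""
    return mean_degree / (2.0 * mean_degree + 1.0)


def driver_fraction_exponential_direct(mean_degree, terms=5000):
    """Half the probability of non-balanced vertices, summed term by term."""
    q = mean_degree / (mean_degree + 1.0)   # q = exp(-1 / kappa)
    c = 1.0 - q                             # normalization constant C
    balanced = sum((c * q ** k) ** 2 for k in range(terms))
    return 0.5 * (1.0 - balanced)
```

For an average degree of 3, both functions give 3/7, and the fraction tends to 1/2 as the average degree increases.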

B.3 Power-law networks 

In this case, we distinguish between networks with a power-law-like distribution that has an
exponential cutoff of the form $P(d^+ = k) = P(d^- = k) = C k^{-\gamma} e^{-k/\kappa}$, and pure power-law
distributions without a cutoff that follow $P(d^+ = k) = P(d^- = k) = C k^{-\gamma}$. The exponential
cutoff makes it possible to normalize the distribution for a given average degree. We will start
with the former case and then show how $n_D$ behaves as the exponential cutoff vanishes (i.e.
$\kappa \to \infty$).

In the general case, $n_D$ is given by

$$n_D^{power} = \frac{1}{2}\left(1 - C^2 \sum_{i=1}^{\infty} i^{-2\gamma} e^{-2i/\kappa}\right) = \frac{1}{2}\left(1 - C^2\, \mathrm{Li}_{2\gamma}(e^{-2/\kappa})\right) = \frac{1}{2} - \frac{\mathrm{Li}_{2\gamma}(e^{-2/\kappa})}{2\, \mathrm{Li}_{\gamma}(e^{-1/\kappa})^2} \tag{15}$$

since we know that $C = 1/\mathrm{Li}_{\gamma}(e^{-1/\kappa})$, where $\mathrm{Li}_s(z)$ is the polylogarithm function. For z = 1,
the polylogarithm reduces to the Riemann zeta function, yielding

$$n_D^{power} = \frac{1}{2} - \frac{\zeta(2\gamma)}{2\, \zeta(\gamma)^2} \tag{16}$$

for pure power-law networks.



B.4 k-regular networks

The three network models presented so far produce balanced components with a very low
probability, hence we were safe to ignore such components in our analytical calculations. In
this section, we present similar calculations for networks where the in- and out-degree of each
vertex is k/2 for some even k. These networks consist of balanced nodes only, and the number
of driver nodes is given by the number of connected components of the graph containing at
least one edge.

Theorem 2. In a k-regular directed network G(V, E) with n vertices, the number of driver
nodes is zero if k = 0, one if k ≥ 4 and the nth harmonic number $H_n$ if k = 2.

Proof. The case of k = 0 is trivial: there are no edges to control and hence the fraction of
driver nodes is zero. For k ≥ 4, dropping the arrowheads gives us an undirected k-regular
graph where it can be proven that it is almost surely k-connected [12], implying that the
original digraph requires only one driver node. For k = 2, each vertex has exactly one
inbound and one outbound edge, thus the entire graph consists of disjoint directed cycles.
By denoting the head of the outbound edge of vertex v by $\pi(v)$, we obtain a permutation $\pi$
on the vertices of the graph, and the number of connected components will be given by the
number of cycles in $\pi$.

Let us call a sequence of elements $u_1, u_2, \ldots, u_m$ an m-cycle of $\pi$ if each $u_i$ is equal to
some $v_j$ and it holds that $\pi(u_1) = u_2, \pi(u_2) = u_3, \ldots, \pi(u_m) = u_1$. First we prove that the
probability of the event that $v_1$ is a part of an m-cycle is 1/n. We require that $u_1 = v_1$,
$u_2 = \pi(u_1) \neq v_1$, $u_3 = \pi(u_2) \neq v_1$, ..., $\pi(u_m) = v_1$. Therefore,

$$P(v_1 \text{ is in an } m\text{-cycle}) = \frac{n-1}{n} \cdot \frac{n-2}{n-1} \cdot \frac{n-3}{n-2} \cdots \frac{n-m+1}{n-m+2} \cdot \frac{1}{n-m+1} = \frac{1}{n}.$$

Of course the above proof applies to every $v_i \in V$. Since each $v_i$ is a part of an m-cycle
with probability 1/n, the expected number of vertices being part of an m-cycle is exactly
1, and since an m-cycle contains m vertices, the expected number of m-cycles is 1/m. The
expected number of cycles of any length then follows by a simple summation:

$$\sum_{m=1}^{n} \frac{1}{m} = H_n.$$

This concludes our proof. □

Since $H_n$ scales approximately as log n, the fraction of driver nodes will scale as log n / n
and tend to zero as n → ∞. k-regular graphs are thus extremely well-controllable in the
infinite limit, requiring O(1) driver nodes if k ≠ 2 and O(log n) driver nodes if k = 2.
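The harmonic-number result for k = 2 can be verified exactly for small n by enumerating all permutations. The Python sketch below does this with rational arithmetic; the function names are hypothetical, and the check assumes, as the proof does, that the permutation is drawn uniformly at random (fixed points, i.e. self-loops, included).

```python
from fractions import Fraction
from itertools import permutations


def cycle_count(perm):
    """Number of cycles of a permutation given as a tuple of images."""
    seen = set()
    cycles = 0
    for start in range(len(perm)):
        if start in seen:
            continue
        cycles += 1
        v = start
        while v not in seen:     # walk around the cycle containing `start`
            seen.add(v)
            v = perm[v]
    return cycles


def expected_cycle_count(n):
    """Exact mean number of cycles over all n! permutations."""
    perms = list(permutations(range(n)))
    return Fraction(sum(cycle_count(p) for p in perms), len(perms))


def harmonic_number(n):
    """H_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, m) for m in range(1, n + 1))
```

Exhaustive enumeration is only feasible for small n (the number of permutations grows as n!), but it confirms that the two quantities coincide exactly in every such case.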



C Computational results 

C.1 Data sources of real networks



The details of the real-world networks we have studied are presented in Table 2. Note that
the semantics of the switchboard dynamics requires that a directed A → B edge represents
a direct influence of A on B and not the other way round, hence we had to reverse the edge
directions in some of the networks to make them conform to this semantics. For instance, an
A → B edge in a trust network usually means that A trusts B, hence B has a direct influence
on A. For the sake of clarity, the table includes the semantics of each edge.



C.2 Robustness of control configurations 

To study the robustness of real networks against random control path failures, we have clas- 
sified each edge according to the change in the number of driver nodes when the edge is 
removed from the network. We distinguish three cases and accordingly three classes of edges. 
The removal of a critical edge increases the number of driver nodes required to maintain con- 
trollability. Conversely, the removal of a so-called distinguished edge decreases the number of 
driver nodes. The remaining edges are called ordinary since their removal does not affect the 
set of driver nodes. 
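The classification above can be sketched as a brute-force procedure that removes each edge in turn and recounts the driver nodes according to Theorem 1. The Python code below is an illustrative sketch only: the function names `driver_count` and `classify_edges` are hypothetical, the input is assumed to be a list of distinct (source, target) pairs, and the O(m(n + m)) running time is not meant to reflect how the published implementation works.

```python
from collections import defaultdict


def driver_count(edges):
    """Divergent vertices plus balanced components (Theorem 1)."""
    surplus = defaultdict(int)        # out-degree minus in-degree
    neighbours = defaultdict(set)     # undirected adjacency for components
    for source, target in edges:
        surplus[source] += 1
        surplus[target] -= 1
        neighbours[source].add(target)
        neighbours[target].add(source)
    count = sum(1 for s in surplus.values() if s > 0)
    seen = set()
    for start in neighbours:
        if start in seen:
            continue
        seen.add(start)
        stack, balanced = [start], True
        while stack:                  # depth-first search of one component
            x = stack.pop()
            balanced = balanced and surplus[x] == 0
            for y in neighbours[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        count += balanced             # a balanced component needs one driver
    return count


def classify_edges(edges):
    """Label each edge by how its removal changes the driver node count."""
    base = driver_count(edges)
    labels = {}
    for i, edge in enumerate(edges):
        reduced = driver_count(edges[:i] + edges[i + 1:])
        if reduced > base:
            labels[edge] = "critical"
        elif reduced < base:
            labels[edge] = "distinguished"
        else:
            labels[edge] = "ordinary"
    return labels
```

In the small example `[(0, 1), (0, 2), (0, 3), (1, 4)]`, removing the edge 0 → 1 turns vertex 1 into a divergent vertex, so that edge is critical while the others are ordinary.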

Figure 4 shows the fraction of critical, ordinary and distinguished edges in each studied real
network, indicating that most networks possess only a small fraction of critical or distinguished
edges, thus exhibiting a high degree of robustness against changes in control configurations due
to random edge removals. The two significant exceptions are the electronic circuit networks
(s208a, s420a and s838a) [37], which contain a high fraction of distinguished edges, and the
metabolic networks [25], where almost half of the edges are critical.

| Type | # | Name | n | m | Ref. | Semantics of A → B |
|---|---|---|---|---|---|---|
| Regulatory | 1. | Ownership-USCorp | 7,253 | 6,726 | 45 | A owns B |
| | 2. | TRN-EC-2 | 418 | 519 | 37 | A regulates B |
| | 3. | TRN-Yeast-1 | 4,441 | 12,873 | 7 | A regulates B |
| | 4. | TRN-Yeast-2 | 688 | 1,079 | | A regulates B |
| Trust | 5. | College* | 32 | 96 | | A is trusted by B |
| | 6. | Epinions* | 75,888 | 508,837 | 55 | A is trusted by B |
| | 7. | Prison* | 67 | 182 | | A is trusted by B |
| | 8. | Slashdot* | 82,168 | 948,464 | 30 | A is trusted by B |
| | 9. | WikiVote* | 7,115 | 103,689 | 29 | A was voted on by B |
| Food web | 10. | Grassland | 88 | 137 | | A preys on B |
| | 11. | Little Rock | 183 | 2,494 | 35 | A preys on B |
| | 12. | Seagrass | 49 | 226 | 14 | A preys on B |
| | 13. | Ythan | 135 | 601 | | A preys on B |
| Metabolic | 14. | C. elegans | 1,173 | 2,864 | | B is produced from A |
| | 15. | E. coli | 2,275 | 5,763 | 25 | B is produced from A |
| | 16. | S. cerevisiae | 1,511 | 3,833 | 25 | B is produced from A |
| Electronic circuits | 17. | s208a | 122 | 189 | 37 | B is a function of A |
| | 18. | s420a | 252 | 399 | 37 | B is a function of A |
| | 19. | s838a | 512 | 819 | 37 | B is a function of A |
| Neuronal and brain | 20. | C. elegans | 297 | 2,359 | | B is within one synapse or gap junction distance from A |
| | 21. | Macaque | 45 | 463 | 40 | Area A is connected to area B |
| Citation | 22. | arXiv-HepPh* | 34,546 | 421,578 | | A is cited by B |
| | 23. | arXiv-HepTh* | 27,770 | 352,807 | 28 | A is cited by B |
| WWW | 24. | Google | 15,763 | 171,206 | 48 | A links to B |
| | 25. | Polblogs | 1,490 | 19,090 | 2 | A links to B |
| | 26. | nd.edu | 325,729 | 1,497,134 | 1 | A links to B |
| | 27. | stanford.edu | 281,904 | 2,312,497 | 28 | A links to B |
| Internet | 28. | p2p-1 | 10,876 | 39,994 | 28 | A sent messages to B |
| | 29. | p2p-2 | 8,846 | 31,839 | 28 | A sent messages to B |
| | 30. | p2p-3 | 8,717 | 31,525 | 28 | A sent messages to B |
| Social communication | 31. | Twitter* | 41.7 × 10^6 | 1.47 × 10^9 | 27 | A is followed by B |
| | 32. | UCIOnline | 1,899 | 20,296 | 49 | A sent emails to B |
| | 33. | WikiTalk | 2,394,385 | 5,021,410 | 29 | A edited the talk page of B on Wikipedia |
| Intra-organizational | 34. | Consulting* | 46 | 879 | 17 | B turned to A for advice |
| | 35. | Freemans-1* | 34 | 645 | 23 | A was nominated by B on a questionnaire as acquaintance |
| | 36. | Freemans-2* | 34 | 830 | 23 | A was nominated by B on a questionnaire as acquaintance |
| | 37. | Manufacturing* | 77 | 2,228 | 17 | B turned to A for advice |
| | 38. | University* | 81 | 817 | 41 | A was nominated by B on a questionnaire |


Table 2: Summary of the real networks analyzed in the paper, n denotes the number of
nodes, m denotes the number of edges. Networks where the edges were reversed compared to
the original publication are marked by an asterisk (*).

Figure 4: Fraction of distinguished (light gray), ordinary (dark gray) and critical (black)
edges in the real networks studied in this paper. Numbers refer to the network indices in
Table 2.

C.3 Implementation 

An open-source implementation of the driver node calculations and the edge classification for
arbitrary networks is provided at http://github.com/ntamas/netctrl.